scrm: efficiently simulating long sequences using the approximated coalescent with recombination
Abstract
Motivation: Coalescent-based simulation software for genomic sequences allows the efficient in silico generation of short- and medium-sized genetic sequences. However, the simulation of genome-size datasets as produced by next-generation sequencing is currently only possible using fairly crude approximations.
Results: We present the sequential coalescent with recombination model (SCRM), a new method that efficiently and accurately approximates the coalescent with recombination, closing the gap between current approximations and the exact model. We present an efficient implementation and show that it can simulate genomic-scale datasets with an essentially correct linkage structure.
Related articles
Approaching Long Genomic Regions and Large Recombination Rates with msParSm as an Alternative to MaCS
The msParSm application is an evolution of msPar, the parallel version of the coalescent simulation program ms, which removes the limitation for simulating long stretches of DNA sequences with large recombination rates, without compromising the accuracy of the standard coalescence. This work introduces msParSm, describes its significant performance improvements over msPar and its shared memory ...
Cosi2: an efficient simulator of exact and approximate coalescent with selection
Motivation: Efficient simulation of population genetic samples under a given demographic model is a prerequisite for many analyses. Coalescent theory provides an efficient framework for such simulations, but simulating longer regions and higher recombination rates remains challenging. Simulators based on a Markovian approximation to the coalescent scale well, but do not support simulation of sel...
Recombination as a point process along sequences.
Histories of sequences in the coalescent model with recombination can be simulated using an algorithm that takes as input a sample of extant sequences. The algorithm traces the history of the sequences going back in time, encountering recombinations and coalescence (duplications) until the ancestral material is located on one sequence for homologous positions in the present sequences. Here an a...
Bioinformatics Advance Access published April 25, 2007
2 structure summarizing coalescent events for each portion of the sequence. To allow for population stratification or other constraints on mating, we define a set of rules that can be used to relate each individual sequence (a row in one of the sparse matrices) to its ancestors in the previous generation (one or more rows in the second sparse matrix). Since we simulate all intervening generatio...
Hybridsim: Simulator for generating allele data in isolated populations
We propose a novel two-phase population simulator for generating diploid marker allele data in isolated populations. Our simulator extends Populus, an exact forward-in-time simulator for isolated populations, with a coalescent simulator. The coalescent, while not suitable for completely replacing Populus, is a useful substitute for the over-simplified founder generator used in Populus. We belie...
Journal title:
Volume 31, Issue
Pages -
Publication date: 2015